Word endings, N=1:

rank | frequency | n-gram |
---|---|---|
1 | 62534 | -n |
2 | 37547 | -a |
3 | 34568 | -e |
4 | 34333 | -i |
5 | 25507 | -ı |

Word endings, N=2:

rank | frequency | n-gram |
---|---|---|
1 | 16229 | -in |
2 | 14561 | -ın |
3 | 12831 | -an |
4 | 11700 | -en |
5 | 9167 | -da |

Word endings, N=3:

rank | frequency | n-gram |
---|---|---|
1 | 6879 | -nin |
2 | 6695 | -dan |
3 | 6187 | -den |
4 | 6082 | -nın |
5 | 5195 | -lar |

Word endings, N=4:

rank | frequency | n-gram |
---|---|---|
1 | 3583 | -inin |
2 | 3298 | -leri |
3 | 3297 | -ları |
4 | 3194 | -ının |
5 | 2622 | -ndan |

Word endings, N=5:

rank | frequency | n-gram |
---|---|---|
1 | 2202 | -ların |
2 | 2173 | -lerin |
3 | 1885 | -inden |
4 | 1859 | -ından |
5 | 1747 | -arını |

The tables show the most frequent letter n-grams at the ends of words for N=1…5. The analysis runs parallel to 2.2.5 Most frequent word beginnings, but the aim here is to detect suffixes rather than prefixes.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
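
The same counting can be sketched outside the database for any N. The Python below is a minimal illustration, not the code used to build the tables: `top_endings` and the sample word list are hypothetical, and it assumes one entry per distinct word, as in the rows of the `words` table. The order of n-grams with equal counts may differ from what MySQL returns.

```python
from collections import Counter


def top_endings(words, n, limit=5):
    """Return (rank, frequency, n-gram) rows for the most frequent word endings."""
    # word[-n:] keeps the whole word when it is shorter than n,
    # which matches the behaviour of RIGHT(word, n) in MySQL.
    counts = Counter("-" + word[-n:] for word in words)
    ranked = counts.most_common(limit)
    return [(rank, freq, gram) for rank, (gram, freq) in enumerate(ranked, start=1)]


# Hypothetical toy word list, not taken from the corpus.
sample = ["evin", "gelin", "yolun", "evden", "okuldan", "evlerin"]

for n in range(1, 6):
    print(f"N={n}")
    for rank, freq, gram in top_endings(sample, n):
        print(f"{rank} | {freq} | {gram}")
```

Calling `top_endings(words, 3)` on the full word list corresponds to the N=3 query above, apart from the `w_id>100` filter, which would have to be applied when the word list is exported.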
2.2.5 Most frequent word beginnings